@vlad-lesin vlad-lesin commented Oct 30, 2025

The fix is to remove the node preceding the sentinel instead of the sentinel itself.

  • The Jira issue number for this PR is: MDEV-37755

Description

TODO: fill description here

Release Notes

TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.

How can this PR be tested?

TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".

If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@vlad-lesin vlad-lesin requested a review from dr-m October 30, 2025 14:54
@vlad-lesin vlad-lesin changed the base branch from main to 10.6 October 30, 2025 14:56

@dr-m dr-m left a comment

Great work!

I would suggest some very minor cleanup to fil_space_free_low() and fil_space_t::drop().

Comment on lines 1595 to 1610
    *detached_handle = handle;
  else
    os_file_close(handle);

  /* The above mnt.commit_file() call should remove the space from
  fil_system.named_spaces, but some pending operations can push it back
  to the container again. */
  mysql_mutex_lock(&log_sys.mutex);
  if (space->max_lsn != 0)
  {
    space->max_lsn= 0;
    fil_system.named_spaces.remove(*space);
  }
  mysql_mutex_unlock(&log_sys.mutex);

  return space;

I see that the added code corresponds to the logic in fil_space_free(). It could be good to mention that in a comment. I was also wondering if the code could be added right after the mysql_mutex_unlock(&fil_system.mutex), before we do anything with handle, to make it more like what fil_space_free() is doing.

As far as I understand, we should be able to safely simplify fil_space_free_low() as well:

diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
index 4e8ddf5f757..08349905b86 100644
--- a/storage/innobase/fil/fil0fil.cc
+++ b/storage/innobase/fil/fil0fil.cc
@@ -861,13 +861,7 @@ static void fil_space_free_low(fil_space_t *space) noexcept
 	/* The tablespace must not be in fil_system.named_spaces. */
 	ut_ad(srv_fast_shutdown == 2 || !srv_was_started
 	      || space->max_lsn == 0);
-
-	/* Wait for fil_space_t::release() after
-	fil_system_t::detach(), the tablespace cannot be found, so
-	fil_space_t::get() would return NULL */
-	while (space->referenced()) {
-		std::this_thread::sleep_for(std::chrono::microseconds(100));
-	}
+	ut_ad(!space->referenced());
 
 	for (fil_node_t* node = UT_LIST_GET_FIRST(space->chain);
 	     node != NULL; ) {
@@ -1604,15 +1598,12 @@ fil_space_t *fil_space_t::drop(ulint id, pfs_os_file_t *detached_handle)
 
   pfs_os_file_t handle= fil_system.detach(space, true);
   mysql_mutex_unlock(&fil_system.mutex);
-  if (detached_handle)
-    *detached_handle = handle;
-  else
-    os_file_close(handle);
-
-  /* The above mnt.commit_file() call should remove the space from
-  fil_system.named_spaces, but some pending operations can push it back
-  to the container again. */
+  /* The above mtr.commit_file(*space, nullptr) should remove the space from
+  fil_system.named_spaces. Before we set the STOPPING_WRITES flag, another
+  concurrent operation could have marked the tablespace dirty again.
+  This clean-up corresponds to fil_space_free(). */
   mysql_mutex_lock(&log_sys.mutex);
+  ut_ad((space->pending() & ~NEEDS_FSYNC) == (STOPPING | CLOSING));
   if (space->max_lsn != 0)
   {
     space->max_lsn= 0;
@@ -1620,6 +1611,11 @@ fil_space_t *fil_space_t::drop(ulint id, pfs_os_file_t *detached_handle)
   }
   mysql_mutex_unlock(&log_sys.mutex);
 
+  if (detached_handle)
+    *detached_handle = handle;
+  else
+    os_file_close(handle);
+
   return space;
 }
 

I tested these changes on top of a merge of 5620ee3 to 759e352, and 200 runs of the test encryption.create_or_replace passed without incident, and so did a run of mysql-test/mtr --suite=encryption,innodb,mariabackup.
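
The guarded removal discussed in this comment can be sketched in isolation. This is only an illustration under stated assumptions, not the InnoDB code: `Space`, `Registry`, `mark_dirty()` and `remove_if_present()` are hypothetical names, a `std::set` stands in for fil_system.named_spaces, and a `std::mutex` stands in for log_sys.mutex. It assumes, as the assertion in fil_space_free_low() suggests, that a nonzero max_lsn means the space is currently in the container.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical stand-in for fil_space_t; not the real InnoDB type.
struct Space
{
  uint64_t max_lsn= 0; // assumed invariant: nonzero <=> member of named_spaces
};

// Hypothetical stand-in for the container and the mutex that protects it.
struct Registry
{
  std::mutex mutex;              // plays the role of log_sys.mutex
  std::set<Space*> named_spaces; // plays the role of fil_system.named_spaces

  // A writer that dirties the space; it may insert the space again
  // after an earlier removal.
  void mark_dirty(Space &space, uint64_t lsn)
  {
    assert(lsn != 0);
    std::lock_guard<std::mutex> guard(mutex);
    if (space.max_lsn == 0)
      named_spaces.insert(&space);
    space.max_lsn= lsn;
  }

  // Idempotent removal: resetting max_lsn in the same critical section
  // ensures that a second call does not remove the space again.
  bool remove_if_present(Space &space)
  {
    std::lock_guard<std::mutex> guard(mutex);
    if (space.max_lsn == 0)
      return false;
    space.max_lsn= 0;
    named_spaces.erase(&space);
    return true;
  }
};
```

In this sketch, calling `remove_if_present()` once more after the pending operations have drained catches a late `mark_dirty()`, and because the max_lsn reset is unconditional rather than debug-only, a further call is a no-op.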

Comment on lines 1639 to 1651
   if (fil_space_t *space= fil_space_t::drop(id, &handle))
-    fil_space_free_low(space);
+      fil_space_free_low(space);
   return handle;
 }

The basic indentation offset should be 2 columns, not 4. This looks like an unintentional change that should be omitted.

…amed_spaces

The mtr.commit_file() call in fil_space_t::drop() removes the space from
fil_system.named_spaces, but another thread can insert the space into
the container again while fil_space_t::drop() is waiting for pending
operations to finish.

The fix is to check for the space in fil_system.named_spaces and remove
it after all pending operations on the space have finished. Also, the
ut_d() macro is removed from the space->max_lsn=0 assignments to avoid
removing the space from fil_system.named_spaces repeatedly.

There is an error in ilist::pop_back(): ilist::end() returns the
sentinel, so pop_back() removes the sentinel from the list instead of
the last element. The error is fixed in this commit.

Reviewed by Marko Mäkelä
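
The pop_back() error described in the commit message can be illustrated with a minimal circular list. This is a sketch under assumptions, not MariaDB's ilist: `Node`, `List` and `unlink()` are hypothetical, and the only property taken from the commit message is that end() returns the sentinel.

```cpp
#include <cassert>

// Hypothetical intrusive node; not MariaDB's ilist_node.
struct Node
{
  Node *prev= nullptr;
  Node *next= nullptr;
};

// Circular doubly linked list whose sentinel doubles as end().
class List
{
  Node sentinel_;

  static void unlink(Node *n)
  {
    n->prev->next= n->next;
    n->next->prev= n->prev;
    n->prev= n->next= nullptr;
  }

public:
  List() { sentinel_.prev= sentinel_.next= &sentinel_; }

  Node *begin() { return sentinel_.next; }
  Node *end() { return &sentinel_; }
  bool empty() const { return sentinel_.next == &sentinel_; }

  void push_back(Node &n)
  {
    n.prev= sentinel_.prev;
    n.next= &sentinel_;
    sentinel_.prev->next= &n;
    sentinel_.prev= &n;
  }

  // A buggy variant would call unlink(end()), which detaches the
  // sentinel itself and corrupts the list.
  // The corrected variant removes the node that precedes the sentinel.
  void pop_back()
  {
    assert(!empty());
    unlink(end()->prev);
  }
};
```

With two nodes pushed, one pop_back() in this sketch leaves the first node as both begin() and the predecessor of end(), and a second call leaves the list empty with the sentinel still linked to itself.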
@dr-m dr-m merged commit 7301fba into 10.6 Nov 11, 2025
12 of 13 checks passed
@dr-m dr-m deleted the 10.6-MDEV-37755 branch November 11, 2025 06:52